-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kshitijk4poor/main #1304
base: main
Are you sure you want to change the base?
Kshitijk4poor/main #1304
Conversation
artemis/config.py
Outdated
@@ -698,6 +698,29 @@ class Nuclei: | |||
"NUCLEI_TEMPLATE_CHUNK_SIZE is 200, three calls will be made with 200 templates each.", | |||
] = get_config("NUCLEI_TEMPLATE_CHUNK_SIZE", default=200, cast=int) | |||
|
|||
class PlaceholderPagesHtmlElements: | |||
PLACEHOLDER_PAGE_HTML_ELEMENTS: Annotated[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this is a good idea to extract this to a file (have config named "PLACEHOLDER_PAGE_CONTENT_FILENAME" and these contents being loaded from file)
artemis/config.py
Outdated
class PlaceholderPagesHtmlElements: | ||
PLACEHOLDER_PAGE_HTML_ELEMENTS: Annotated[ | ||
List[str], | ||
"A list of html elements that appear on placeholder pages", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe write that exact string matching will be performed, no parsing, no nothing. And write what will happen with such pages
artemis/module_base.py
Outdated
@@ -153,6 +154,11 @@ def check_domain_exists(self, domain: str) -> bool: | |||
bool: True if the domain exists, False otherwise. | |||
""" | |||
try: | |||
manager = AnalyzerManager(domain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name this class more verbosely, not AnalyzerManager, maybe PlaceholderPageDetector?
artemis/module_base.py
Outdated
@@ -153,6 +154,11 @@ def check_domain_exists(self, domain: str) -> bool: | |||
bool: True if the domain exists, False otherwise. | |||
""" | |||
try: | |||
manager = AnalyzerManager(domain) | |||
result = manager.run_analysis() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mabe just is_placeholder(domain)
artemis/config.py
Outdated
@@ -698,6 +698,29 @@ class Nuclei: | |||
"NUCLEI_TEMPLATE_CHUNK_SIZE is 200, three calls will be made with 200 templates each.", | |||
] = get_config("NUCLEI_TEMPLATE_CHUNK_SIZE", default=200, cast=int) | |||
|
|||
class PlaceholderPagesHtmlElements: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO add a configuration variable to enable this behavior, default false - by default people may not want random webpages to be skipped
artemis/placeholder_page_detector.py
Outdated
if response: | ||
html_content = response.content | ||
with open( | ||
Config.Modules.PlaceholderPageContent.PLACEHOLDER_PAGE_CONTENT_FILENAME, "r", encoding="utf-8" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will cause the file to be read on every check, this can be made more optimal by moving to init and constrycting a singleton instance (and calling is_placeholder on different urls)
artemis/placeholder_page_detector.py
Outdated
return True | ||
|
||
|
||
class PlaceholderPageDetector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think having two classes adds anything in terms of readability
artemis/module_base.py
Outdated
@@ -153,6 +154,12 @@ def check_domain_exists(self, domain: str) -> bool: | |||
bool: True if the domain exists, False otherwise. | |||
""" | |||
try: | |||
if Config.Modules.PlaceholderPageContent.PLACEHOLDER_PAGE_DETECTOR_ENABLE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PLACEHOLDER_PAGE_DETECTOR_ENABLED or ENABLE_PLACEHOLDER_PAGE_DETECTOR
artemis/config.py
Outdated
class PlaceholderPageContent: | ||
PLACEHOLDER_PAGE_DETECTOR_ENABLE: Annotated[ | ||
bool, | ||
"Enable or disable placeholder pages detector." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space or newline between dot and A
artemis/config.py
Outdated
PLACEHOLDER_PAGE_DETECTOR_ENABLE: Annotated[ | ||
bool, | ||
"Enable or disable placeholder pages detector." | ||
"A strict string matching will be performed without any parsing or modifications. The string will be " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two sentences with the same meaning: "A strict string matching will be performed without any parsing or modifications." and "The string will ...".
artemis/config.py
Outdated
"Enable or disable placeholder pages detector." | ||
"A strict string matching will be performed without any parsing or modifications. The string will be " | ||
"matched exactly as provided, without applying any transformations or processing. If the page exists " | ||
"but the specified string is found within it, the page will not be scanned for vulnerabilities. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo "and the specified"
artemis/config.py
Outdated
class PlaceholderPageContent: | ||
PLACEHOLDER_PAGE_DETECTOR_ENABLE: Annotated[ | ||
bool, | ||
"Enable or disable placeholder pages detector." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add maybe "using this feature you may skip vulnerability scanning for websites that aren't built yet, but e.g. contain a hosting provider placeholder page"
artemis/config.py
Outdated
) | ||
PLACEHOLDER_PAGE_CONTENT_FILENAME: Annotated[ | ||
str, | ||
"Path to placeholder page content file.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
opisz jeszcze format, one string per line
artemis/placeholder_page_detector.py
Outdated
PLACEHOLDER_PAGE_CONTENT_FILENAME = Config.Modules.PlaceholderPageContent.PLACEHOLDER_PAGE_CONTENT_FILENAME | ||
|
||
|
||
placeholder_page_content = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is constant as well, upcase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But after read the file, list will be modified.
artemis/placeholder_page_detector.py
Outdated
|
||
@staticmethod | ||
def check_response(domain: str) -> Any: | ||
if domain.startswith(("https://", "http://")): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
domain can't possibly start with https://, did you actually detect such situation? if no, IMO this btanch is redundant
artemis/placeholder_page_detector.py
Outdated
response = http_requests.get(url) | ||
except requests.RequestException: | ||
return False | ||
response.encoding = "utf-8" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a remnant of beautiful soup - probably
No description provided.